Overview
Dataset statistics
| Number of variables | 27 |
|---|---|
| Number of observations | 167278 |
| Missing cells | 911893 |
| Missing cells (%) | 20.2% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 245.2 MiB |
| Average record size in memory | 1.5 KiB |
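The derived figures in the table above can be cross-checked from the raw counts. The sketch below assumes that the missing-cell percentage is missing cells divided by variables times observations, and that the average record size is total memory divided by the number of observations. Both are assumptions about how the report derives them, although the numbers agree.

```python
# Cross-check of the dataset statistics (values copied from the table above).
n_variables = 27
n_observations = 167_278
missing_cells = 911_893
total_mib = 245.2

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size, converting MiB to KiB (1 MiB = 1024 KiB).
avg_record_kib = total_mib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_kib:.1f} KiB")
```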
Variable types
| Text | 15 |
|---|---|
| Categorical | 6 |
| DateTime | 2 |
| Boolean | 2 |
| Numeric | 2 |
Alerts
| Alert | Type |
|---|---|
| EDUCATION_LEVEL_REQUIRED is highly overall correlated with VISA_CLASS | High correlation |
| EXPERIENCE_REQUIRED_NUM_MONTHS is highly overall correlated with VISA_CLASS and 1 other field | High correlation |
| EXPERIENCE_REQUIRED_Y_N is highly overall correlated with VISA_CLASS | High correlation |
| FULL_TIME_POSITION_Y_N is highly overall correlated with PAID_WAGE_SUBMITTED_UNIT and 1 other field | High correlation |
| PAID_WAGE_SUBMITTED_UNIT is highly overall correlated with FULL_TIME_POSITION_Y_N and 1 other field | High correlation |
| PREVAILING_WAGE_SUBMITTED_UNIT is highly overall correlated with FULL_TIME_POSITION_Y_N and 1 other field | High correlation |
| VISA_CLASS is highly overall correlated with EDUCATION_LEVEL_REQUIRED and 2 other fields | High correlation |
| order is highly overall correlated with EXPERIENCE_REQUIRED_NUM_MONTHS | High correlation |
| CASE_STATUS is highly imbalanced (60.2%) | Imbalance |
| PREVAILING_WAGE_SUBMITTED_UNIT is highly imbalanced (86.5%) | Imbalance |
| PAID_WAGE_SUBMITTED_UNIT is highly imbalanced (86.4%) | Imbalance |
| FULL_TIME_POSITION_Y_N is highly imbalanced (84.2%) | Imbalance |
| VISA_CLASS is highly imbalanced (81.0%) | Imbalance |
| EDUCATION_LEVEL_REQUIRED has 156215 (93.4%) missing values | Missing |
| COLLEGE_MAJOR_REQUIRED has 156227 (93.4%) missing values | Missing |
| EXPERIENCE_REQUIRED_Y_N has 156185 (93.4%) missing values | Missing |
| EXPERIENCE_REQUIRED_NUM_MONTHS has 162313 (97.0%) missing values | Missing |
| COUNTRY_OF_CITIZENSHIP has 156185 (93.4%) missing values | Missing |
| WORK_POSTAL_CODE has 113604 (67.9%) missing values | Missing |
| FULL_TIME_POSITION_Y_N has 11093 (6.6%) missing values | Missing |
| order is uniformly distributed | Uniform |
| CASE_NUMBER has unique values | Unique |
| order has unique values | Unique |
Reproduction
| Analysis started | 2026-01-19 13:19:42.032721 |
|---|---|
| Analysis finished | 2026-01-19 13:19:54.426296 |
| Duration | 12.39 seconds |
| Software version | ydata-profiling v4.18.0 |
| Download configuration | config.json |
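The reported duration follows directly from the two timestamps. A minimal check using only the standard library, with the timestamps copied from the table above:

```python
from datetime import datetime

# Timestamps as printed in the Reproduction table.
started = datetime.fromisoformat("2026-01-19 13:19:42.032721")
finished = datetime.fromisoformat("2026-01-19 13:19:54.426296")

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```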
Variables
CASE_NUMBER
Text
Unique
| Distinct | 167278 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 11.9 MiB |
Length
| Max length | 18 |
|---|---|
| Median length | 18 |
| Mean length | 17.668426 |
| Min length | 13 |
Unique
| Unique | 167278 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | I-200-14073-248840 |
|---|---|
| 2nd row | A-15061-55212 |
| 3rd row | I-200-13256-001092 |
| 4th row | I-200-14087-353657 |
| 5th row | I-203-14259-128844 |
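The sample rows show two shapes of case number: an 18-character form starting with `I-` and a 13-character form starting with `A-`. The sketch below is a consistency check, not a description of the official numbering scheme. It assumes every value has one of these two shapes, and that the 156185 occurrences of the character `I` reported for this variable correspond to one long-form value each. The regular expressions are hypothetical and inferred from the five samples only. Under those assumptions, the reported mean length and total character count are reproduced.

```python
import re

# Hypothetical patterns inferred from the sample rows only.
LONG_FORM = re.compile(r"I-\d{3}-\d{5}-\d{6}")   # e.g. I-200-14073-248840 (18 chars)
SHORT_FORM = re.compile(r"A-\d{5}-\d{5}")        # e.g. A-15061-55212 (13 chars)

def case_number_form(value: str) -> str:
    """Classify a case number as 'long', 'short', or 'other'."""
    if LONG_FORM.fullmatch(value):
        return "long"
    if SHORT_FORM.fullmatch(value):
        return "short"
    return "other"

n_rows = 167_278
n_long = 156_185            # assumption: one 'I' per long-form value
n_short = n_rows - n_long   # remaining rows assumed to be short-form

total_chars = n_long * 18 + n_short * 13
mean_length = total_chars / n_rows

print(case_number_form("I-200-14073-248840"), case_number_form("A-15061-55212"))
print(total_chars, round(mean_length, 6))
```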
| Value | Count | Frequency (%) |
| i-200-14073-248840 | 1 | < 0.1% |
| i-200-13311-440603 | 1 | < 0.1% |
| i-200-13273-900548 | 1 | < 0.1% |
| i-200-13274-808058 | 1 | < 0.1% |
| i-200-14069-400950 | 1 | < 0.1% |
| i-200-13256-001092 | 1 | < 0.1% |
| i-200-14087-353657 | 1 | < 0.1% |
| i-203-14259-128844 | 1 | < 0.1% |
| i-200-14092-483272 | 1 | < 0.1% |
| i-200-13084-487292 | 1 | < 0.1% |
| Other values (167268) | 167268 | > 99.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 523625 | 17.7% |
| - | 490741 | 16.6% |
| 1 | 339562 | 11.5% |
| 2 | 328811 | 11.1% |
| 3 | 201849 | 6.8% |
| 4 | 196955 | 6.7% |
| 5 | 163600 | 5.5% |
| I | 156185 | 5.3% |
| 7 | 144688 | 4.9% |
| 6 | 139716 | 4.7% |
| Other values (4) | 269807 | 9.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 2955539 |
CASE_STATUS
Categorical
Imbalance
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 10.7 MiB |
| certified | 140031 |
|---|---|
| certified-withdrawn | 14146 |
| withdrawn | 5602 |
| denied | 4273 |
| certified-expired | 3226 |
Length
| Max length | 19 |
|---|---|
| Median length | 9 |
| Mean length | 9.9233073 |
| Min length | 6 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | denied |
|---|---|
| 2nd row | denied |
| 3rd row | denied |
| 4th row | denied |
| 5th row | denied |
Common Values
| Value | Count | Frequency (%) |
| certified | 140031 | 83.7% |
| certified-withdrawn | 14146 | 8.5% |
| withdrawn | 5602 | 3.3% |
| denied | 4273 | 2.6% |
| certified-expired | 3226 | 1.9% |
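The 60.2% imbalance figure reported for CASE_STATUS is consistent with one minus the normalized Shannon entropy of the class counts. The sketch below assumes that definition. It is an assumption about how the report computes the score, and the function name `imbalance_score` is illustrative rather than part of any library.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/H_max, where H is the Shannon entropy of the class shares.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# Counts copied from the Common Values table above.
case_status_counts = {
    "certified": 140031,
    "certified-withdrawn": 14146,
    "withdrawn": 5602,
    "denied": 4273,
    "certified-expired": 3226,
}

score = imbalance_score(list(case_status_counts.values()))
print(f"imbalance: {score:.1%}")
```

The result comes out at roughly 60%, in line with the alert, which suggests the alert is driven by the dominance of `certified` rather than by the number of classes.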
Most occurring characters
| Value | Count | Frequency (%) |
| i | 342053 | 20.6% |
| e | 329804 | 19.9% |
| d | 188923 | 11.4% |
| r | 180377 | 10.9% |
| t | 177151 | 10.7% |
| c | 157403 | 9.5% |
| f | 157403 | 9.5% |
| w | 39496 | 2.4% |
| n | 24021 | 1.4% |
| h | 19748 | 1.2% |
| Other values (4) | 43572 | 2.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 1659951 |
CASE_RECEIVED_DATE
Date
| Distinct | 1769 |
|---|---|
| Distinct (%) | 1.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.3 MiB |
| Minimum | 2008-07-16 00:00:00 |
|---|---|
| Maximum | 2015-06-29 00:00:00 |
| Invalid dates | 0 |
| Invalid dates (%) | 0.0% |
DECISION_DATE
Date
| Distinct | 874 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.3 MiB |
| Minimum | 2011-10-03 00:00:00 |
|---|---|
| Maximum | 2015-06-30 00:00:00 |
| Invalid dates | 0 |
| Invalid dates (%) | 0.0% |
EMPLOYER_NAME
Text
| Distinct | 23773 |
|---|---|
| Distinct (%) | 14.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 12.7 MiB |
Length
| Max length | 70 |
|---|---|
| Median length | 59 |
| Mean length | 22.597837 |
| Min length | 2 |
Unique
| Unique | 11002 |
|---|---|
| Unique (%) | 6.6% |
Sample
| 1st row | ADVANCED TECHNOLOGY GROUP USA, INC. |
|---|---|
| 2nd row | SAN FRANCISCO STATE UNIVERSITY |
| 3rd row | CAROUSEL SCHOOL |
| 4th row | HARLINGEN CONSOLIDATED INDEPENDENT SCHOOL DISTRICT |
| 5th row | SIGNAL SCIENCES CORPORATION |
| Value | Count | Frequency (%) |
| inc | 91608 | 17.3% |
| llc | 16521 | 3.1% |
| university | 16301 | 3.1% |
| corporation | 13834 | 2.6% |
| of | 12578 | 2.4% |
| technologies | 9998 | 1.9% |
| systems | 8002 | 1.5% |
| solutions | 7903 | 1.5% |
| school | 7119 | 1.3% |
| services | 6525 | 1.2% |
| Other values (17402) | 338991 | 64.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 362355 | 9.6% |
| I | 351701 | 9.3% |
| N | 310465 | 8.2% |
| E | 279696 | 7.4% |
| O | 279110 | 7.4% |
| C | 276296 | 7.3% |
| T | 241073 | 6.4% |
| S | 232804 | 6.2% |
| A | 214875 | 5.7% |
| R | 188413 | 5.0% |
| Other values (79) | 1043333 | 27.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 3780121 |
PREVAILING_WAGE_SUBMITTED
Text
| Distinct | 22509 |
|---|---|
| Distinct (%) | 13.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 10.2 MiB |
Length
| Max length | 10 |
|---|---|
| Median length | 9 |
| Mean length | 6.8641603 |
| Min length | 2 |
Unique
| Unique | 9139 |
|---|---|
| Unique (%) | 5.5% |
Sample
| 1st row | 6217100 |
|---|---|
| 2nd row | 5067600 |
| 3rd row | 4947000 |
| 4th row | 251052.00 |
| 5th row | 84573.00 |
| Value | Count | Frequency (%) |
| 98675 | 908 | 0.5% |
| 88254.00 | 839 | 0.5% |
| 98675.00 | 796 | 0.5% |
| 109762.00 | 779 | 0.5% |
| 97219.00 | 763 | 0.5% |
| 94162.00 | 606 | 0.4% |
| 80746.00 | 603 | 0.4% |
| 93267.00 | 601 | 0.4% |
| 62379.00 | 497 | 0.3% |
| 116605 | 489 | 0.3% |
| Other values (22483) | 160397 | 95.9% |
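The table above counts `98675` and `98675.00` as different values because the column is stored as text. A minimal sketch of merging such spellings before counting, assuming the values are plain decimal strings without currency symbols or thousands separators (none appear in the samples shown). The helper name `normalize_wage` is hypothetical.

```python
from collections import Counter
from decimal import Decimal, InvalidOperation

def normalize_wage(raw: str):
    """Parse a wage string; return None if it is not a finite decimal number."""
    try:
        value = Decimal(raw.strip())
    except InvalidOperation:
        return None
    return value if value.is_finite() else None

# A few raw value counts copied from the table above.
raw_counts = {"98675": 908, "98675.00": 796, "116605": 489}

# Decimal("98675") and Decimal("98675.00") compare and hash equal,
# so both spellings accumulate under a single key.
merged = Counter()
for raw, count in raw_counts.items():
    merged[normalize_wage(raw)] += count

for value, count in merged.most_common():
    print(value, count)
```

Using `Decimal` rather than `float` keeps the comparison exact, which matters if the merged values are later used as grouping keys.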
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 290460 | 25.3% |
| . | 103822 | 9.0% |
| 6 | 95532 | 8.3% |
| 1 | 91698 | 8.0% |
| 7 | 89849 | 7.8% |
| 4 | 86077 | 7.5% |
| 8 | 82774 | 7.2% |
| 9 | 78968 | 6.9% |
| 2 | 78434 | 6.8% |
| 5 | 78157 | 6.8% |
| Other values (2) | 72452 | 6.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 1148223 |
PREVAILING_WAGE_SUBMITTED_UNIT
Categorical
High correlation Imbalance
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 9.7 MiB |
| year | 158413 |
|---|---|
| hour | 8500 |
| month | 292 |
| week | 59 |
| bi-weekly | 14 |
Length
| Max length | 9 |
|---|---|
| Median length | 4 |
| Mean length | 4.0021641 |
| Min length | 4 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | year |
|---|---|
| 2nd row | year |
| 3rd row | year |
| 4th row | month |
| 5th row | bi-weekly |
Common Values
| Value | Count | Frequency (%) |
| year | 158413 | 94.7% |
| hour | 8500 | 5.1% |
| month | 292 | 0.2% |
| week | 59 | < 0.1% |
| bi-weekly | 14 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| r | 166913 | 24.9% |
| e | 158559 | 23.7% |
| y | 158427 | 23.7% |
| a | 158413 | 23.7% |
| h | 8792 | 1.3% |
| o | 8792 | 1.3% |
| u | 8500 | 1.3% |
| m | 292 | < 0.1% |
| n | 292 | < 0.1% |
| t | 292 | < 0.1% |
| Other values (6) | 202 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 669474 |
PAID_WAGE_SUBMITTED
Text
| Distinct | 23031 |
|---|---|
| Distinct (%) | 13.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 10.0 MiB |
Length
| Max length | 10 |
|---|---|
| Median length | 5 |
| Mean length | 5.4854015 |
| Min length | 1 |
Unique
| Unique | 13615 |
|---|---|
| Unique (%) | 8.1% |
Sample
| 1st row | 62171 |
|---|---|
| 2nd row | 91440 |
| 3rd row | 49470 |
| 4th row | 43800 |
| 5th row | 170000 |
| Value | Count | Frequency (%) |
| 60000 | 6967 | 4.2% |
| 65000 | 3887 | 2.3% |
| 70000 | 3580 | 2.1% |
| 90000 | 3154 | 1.9% |
| 100000 | 3094 | 1.8% |
| 80000 | 3091 | 1.8% |
| 75000 | 2825 | 1.7% |
| 85000 | 2587 | 1.5% |
| 110000 | 2455 | 1.5% |
| 105000 | 2392 | 1.4% |
| Other values (23021) | 133246 | 79.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 386578 | 42.1% |
| 1 | 82983 | 9.0% |
| 5 | 69516 | 7.6% |
| 6 | 66452 | 7.2% |
| 7 | 56802 | 6.2% |
| 8 | 53156 | 5.8% |
| 2 | 50188 | 5.5% |
| 4 | 44240 | 4.8% |
| 9 | 44075 | 4.8% |
| 3 | 41815 | 4.6% |
| Other values (2) | 21782 | 2.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 917587 |
PAID_WAGE_SUBMITTED_UNIT
Categorical
High correlation Imbalance
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 9.7 MiB |
| year | 158428 |
|---|---|
| hour | 8397 |
| month | 358 |
| week | 62 |
| bi-weekly | 33 |
Length
| Max length | 9 |
|---|---|
| Median length | 4 |
| Mean length | 4.0031265 |
| Min length | 4 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | year |
|---|---|
| 2nd row | year |
| 3rd row | year |
| 4th row | year |
| 5th row | year |
Common Values
| Value | Count | Frequency (%) |
| year | 158428 | 94.7% |
| hour | 8397 | 5.0% |
| month | 358 | 0.2% |
| week | 62 | < 0.1% |
| bi-weekly | 33 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| r | 166825 | 24.9% |
| e | 158618 | 23.7% |
| y | 158461 | 23.7% |
| a | 158428 | 23.7% |
| h | 8755 | 1.3% |
| o | 8755 | 1.3% |
| u | 8397 | 1.3% |
| m | 358 | 0.1% |
| n | 358 | 0.1% |
| t | 358 | 0.1% |
| Other values (6) | 322 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 669635 |
JOB_TITLE
Text
| Distinct | 12589 |
|---|---|
| Distinct (%) | 7.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 12.6 MiB |
Length
| Max length | 94 |
|---|---|
| Median length | 91 |
| Mean length | 21.678852 |
| Min length | 7 |
Unique
| Unique | 8218 |
|---|---|
| Unique (%) | 4.9% |
Sample
| 1st row | SOFTWARE ENGINEER |
|---|---|
| 2nd row | Assistant Professor of Marketing |
| 3rd row | SPECIAL EDUCATION TEACHER |
| 4th row | SCIENCE TEACHER |
| 5th row | SENIOR SOFTWARE ENGINEER |
| Value | Count | Frequency (%) |
| software | 99217 | 22.5% |
| engineer | 96969 | 22.0% |
| analyst | 31579 | 7.2% |
| business | 27846 | 6.3% |
| senior | 19028 | 4.3% |
| assistant | 18933 | 4.3% |
| professor | 18531 | 4.2% |
| teacher | 13515 | 3.1% |
| data | 5241 | 1.2% |
| | 5235 | 1.2% |
| Other values (4225) | 104126 | 23.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| E | 534851 | 14.7% |
| S | 372932 | 10.3% |
| N | 326319 | 9.0% |
| R | 288600 | 8.0% |
| A | 277283 | 7.6% |
| (space) | 273097 | 7.5% |
| T | 229430 | 6.3% |
| I | 218543 | 6.0% |
| O | 186451 | 5.1% |
| F | 122777 | 3.4% |
| Other values (65) | 796112 | 22.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 3626395 |
WORK_CITY
Text
| Distinct | 4888 |
|---|---|
| Distinct (%) | 2.9% |
| Missing | 3 |
| Missing (%) | < 0.1% |
| Memory size | 10.6 MiB |
Length
| Max length | 44 |
|---|---|
| Median length | 26 |
| Mean length | 9.1463937 |
| Min length | 2 |
Unique
| Unique | 1615 |
|---|---|
| Unique (%) | 1.0% |
Sample
| 1st row | BLOOMINGTON |
|---|---|
| 2nd row | SAN FRANCISCO |
| 3rd row | LOS ANGELES |
| 4th row | HARLINGEN CISD |
| 5th row | PORTLAND |
| Value | Count | Frequency (%) |
| san | 16892 | 7.2% |
| new | 7743 | 3.3% |
| view | 7459 | 3.2% |
| mountain | 7418 | 3.2% |
| york | 7116 | 3.0% |
| francisco | 6847 | 2.9% |
| city | 5776 | 2.5% |
| jose | 3757 | 1.6% |
| santa | 3661 | 1.6% |
| diego | 3332 | 1.4% |
| Other values (3409) | 163923 | 70.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 149438 | 9.8% |
| N | 137654 | 9.0% |
| O | 120134 | 7.9% |
| E | 110301 | 7.2% |
| I | 97159 | 6.4% |
| S | 96910 | 6.3% |
| L | 89550 | 5.9% |
| R | 87077 | 5.7% |
| T | 83359 | 5.4% |
| (space) | 66665 | 4.4% |
| Other values (56) | 491716 | 32.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 1529963 |
EDUCATION_LEVEL_REQUIRED
Categorical
High correlation Missing
| Distinct | 6 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 156215 |
| Missing (%) | 93.4% |
| Memory size | 10.2 MiB |
| Master's | 5550 |
|---|---|
| Bachelor's | 3938 |
| Doctorate | 1181 |
| Other | 366 |
| Associate's | 21 |
Length
| Max length | 11 |
|---|---|
| Median length | 8 |
| Mean length | 8.727018 |
| Min length | 5 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Doctorate |
|---|---|
| 2nd row | Bachelor's |
| 3rd row | Bachelor's |
| 4th row | Master's |
| 5th row | Other |
Common Values
| Value | Count | Frequency (%) |
| Master's | 5550 | 3.3% |
| Bachelor's | 3938 | 2.4% |
| Doctorate | 1181 | 0.7% |
| Other | 366 | 0.2% |
| Associate's | 21 | < 0.1% |
| High School | 7 | < 0.1% |
| (Missing) | 156215 | 93.4% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| master's | 5550 | 50.1% |
| bachelor's | 3938 | 35.6% |
| doctorate | 1181 | 10.7% |
| other | 366 | 3.3% |
| associate's | 21 | 0.2% |
| high | 7 | 0.1% |
| school | 7 | 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| s | 15101 | 15.6% |
| e | 11056 | 11.5% |
| r | 11035 | 11.4% |
| a | 10690 | 11.1% |
| ' | 9509 | 9.8% |
| t | 8299 | 8.6% |
| o | 6335 | 6.6% |
| M | 5550 | 5.7% |
| c | 5147 | 5.3% |
| h | 4318 | 4.5% |
| Other values (10) | 9507 | 9.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 96547 |
COLLEGE_MAJOR_REQUIRED
Text
Missing
| Distinct | 3261 |
|---|---|
| Distinct (%) | 29.5% |
| Missing | 156227 |
| Missing (%) | 93.4% |
| Memory size | 5.8 MiB |
Length
| Max length | 100 |
|---|---|
| Median length | 85 |
| Mean length | 41.916659 |
| Min length | 2 |
Unique
| Unique | 2529 |
|---|---|
| Unique (%) | 22.9% |
Sample
| 1st row | marketing |
|---|---|
| 2nd row | computer science, electrical engineering |
| 3rd row | computer science, electrical engineering, or a related field |
| 4th row | electronic eng, computer sci, computer eng, imaging or related field |
| 5th row | medicine |
| Value | Count | Frequency (%) |
| or | 8561 | 13.7% |
| computer | 7571 | 12.1% |
| science | 5617 | 9.0% |
| related | 5404 | 8.7% |
| engineering | 3832 | 6.1% |
| field | 3670 | 5.9% |
| comp | 2137 | 3.4% |
| eng | 1725 | 2.8% |
| sci | 1631 | 2.6% |
| electrical | 1627 | 2.6% |
| Other values (1281) | 20649 | 33.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 65971 | 14.2% |
| (space) | 51431 | 11.1% |
| c | 36868 | 8.0% |
| i | 33542 | 7.2% |
| n | 32352 | 7.0% |
| r | 31978 | 6.9% |
| o | 25681 | 5.5% |
| t | 23671 | 5.1% |
| s | 19143 | 4.1% |
| l | 18457 | 4.0% |
| Other values (38) | 124127 | 26.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 463221 |
EXPERIENCE_REQUIRED_Y_N
Boolean
High correlation Missing
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 156185 |
| Missing (%) | 93.4% |
| Memory size | 326.8 KiB |
| False | 6139 |
|---|---|
| True | 4954 |
| (Missing) | 156185 |
| Value | Count | Frequency (%) |
| False | 6139 | 3.7% |
| True | 4954 | 3.0% |
| (Missing) | 156185 | 93.4% |
EXPERIENCE_REQUIRED_NUM_MONTHS
Real number (ℝ)
High correlation Missing
| Distinct | 26 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 162313 |
| Missing (%) | 97.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 34.692044 |
| Minimum | 0 |
|---|---|
| Maximum | 144 |
| Zeros | 11 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.3 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 6 |
| Q1 | 12 |
| median | 24 |
| Q3 | 60 |
| 95-th percentile | 60 |
| Maximum | 144 |
| Range | 144 |
| Interquartile range (IQR) | 48 |
Descriptive statistics
| Standard deviation | 22.317783 |
|---|---|
| Coefficient of variation (CV) | 0.64331126 |
| Kurtosis | -0.7543936 |
| Mean | 34.692044 |
| Median Absolute Deviation (MAD) | 12 |
| Skewness | 0.40818635 |
| Sum | 172246 |
| Variance | 498.08343 |
| Monotonicity | Not monotonic |
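The statistics above are mutually consistent, which can be confirmed with a few lines of arithmetic. The sketch below uses only values printed in this report. It assumes the coefficient of variation is the standard deviation divided by the mean and that the mean is taken over non-missing rows. These are the usual definitions, but they are not stated in the report itself.

```python
# Values copied from the report for EXPERIENCE_REQUIRED_NUM_MONTHS.
n_rows = 167_278
n_missing = 162_313
mean = 34.692044
std = 22.317783
variance = 498.08343
total = 172_246
q1, q3 = 12, 60

n_present = n_rows - n_missing       # rows that actually carry a value
mean_from_sum = total / n_present    # should match the reported mean
cv = std / mean                      # coefficient of variation
iqr = q3 - q1                        # interquartile range

print(n_present, round(mean_from_sum, 6), round(cv, 6), iqr, round(std ** 2, 3))
```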
| Value | Count | Frequency (%) |
| 60 | 1604 | 1.0% |
| 24 | 1105 | 0.7% |
| 12 | 938 | 0.6% |
| 36 | 561 | 0.3% |
| 6 | 454 | 0.3% |
| 48 | 78 | < 0.1% |
| 72 | 46 | < 0.1% |
| 84 | 39 | < 0.1% |
| 96 | 28 | < 0.1% |
| 3 | 20 | < 0.1% |
| Other values (16) | 92 | 0.1% |
| (Missing) | 162313 | 97.0% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 11 | < 0.1% |
| 1 | 15 | < 0.1% |
| 2 | 3 | < 0.1% |
| 3 | 20 | < 0.1% |
| 4 | 18 | < 0.1% |
| 5 | 2 | < 0.1% |
| 6 | 454 | 0.3% |
| 8 | 1 | < 0.1% |
| 9 | 9 | < 0.1% |
| 10 | 3 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 144 | 1 | < 0.1% |
| 120 | 9 | < 0.1% |
| 108 | 4 | < 0.1% |
| 96 | 28 | < 0.1% |
| 84 | 39 | < 0.1% |
| 72 | 46 | < 0.1% |
| 60 | 1604 | 1.0% |
| 48 | 78 | < 0.1% |
| 42 | 1 | < 0.1% |
| 36 | 561 | 0.3% |
COUNTRY_OF_CITIZENSHIP
Text
Missing
| Distinct | 134 |
|---|---|
| Distinct (%) | 1.2% |
| Missing | 156185 |
| Missing (%) | 93.4% |
| Memory size | 5.4 MiB |
Length
| Max length | 32 |
|---|---|
| Median length | 5 |
| Mean length | 5.8120436 |
| Min length | 4 |
Unique
| Unique | 31 |
|---|---|
| Unique (%) | 0.3% |
Sample
| 1st row | IRAN |
|---|---|
| 2nd row | INDIA |
| 3rd row | INDIA |
| 4th row | FRANCE |
| 5th row | JAPAN |
| Value | Count | Frequency (%) |
| india | 6587 | 56.7% |
| china | 1101 | 9.5% |
| canada | 425 | 3.7% |
| philippines | 300 | 2.6% |
| south | 284 | 2.4% |
| korea | 269 | 2.3% |
| mexico | 255 | 2.2% |
| taiwan | 110 | 0.9% |
| russia | 109 | 0.9% |
| france | 100 | 0.9% |
| Other values (147) | 2083 | 17.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| I | 16953 | 26.3% |
| A | 11825 | 18.3% |
| N | 10011 | 15.5% |
| D | 7425 | 11.5% |
| C | 2137 | 3.3% |
| E | 2124 | 3.3% |
| H | 1840 | 2.9% |
| R | 1409 | 2.2% |
| O | 1334 | 2.1% |
| P | 1327 | 2.1% |
| Other values (20) | 8088 | 12.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 64473 |
PREVAILING_WAGE_SOC_CODE
Text
| Distinct | 410 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 10.2 MiB |
Length
| Max length | 10 |
|---|---|
| Median length | 7 |
| Mean length | 7.0643898 |
| Min length | 6 |
Unique
| Unique | 76 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 15-1132 |
|---|---|
| 2nd row | 25-1011 |
| 3rd row | 25-2052 |
| 4th row | 25-1042 |
| 5th row | 15-1133 |
| Value | Count | Frequency (%) |
| 15-1132 | 73136 | 43.7% |
| 15-1121 | 16939 | 10.1% |
| 15-1133 | 16693 | 10.0% |
| 13-1111 | 7984 | 4.8% |
| 25-2031 | 3919 | 2.3% |
| 25-2021 | 3707 | 2.2% |
| 15-1131 | 3698 | 2.2% |
| 15-1199 | 2729 | 1.6% |
| 25-1071 | 2605 | 1.6% |
| 15-2031 | 2111 | 1.3% |
| Other values (400) | 33757 | 20.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 480566 | 40.7% |
| - | 167237 | 14.2% |
| 2 | 159998 | 13.5% |
| 5 | 154192 | 13.0% |
| 3 | 137042 | 11.6% |
| 0 | 45056 | 3.8% |
| 9 | 17911 | 1.5% |
| 4 | 5525 | 0.5% |
| 7 | 4941 | 0.4% |
| 6 | 4857 | 0.4% |
| Other values (2) | 4392 | 0.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 1181717 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 1181717 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 1181717 | 100.0% |
PREVAILING_WAGE_SOC_TITLE
Text
| Distinct | 561 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 14.4 MiB |
Length
| Max length | 78 |
|---|---|
| Median length | 75 |
| Mean length | 32.954913 |
| Min length | 6 |
Unique
| Unique | 136 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | Software Developers, Applications |
|---|---|
| 2nd row | Business Teachers, Postsecondary |
| 3rd row | Special Education Teachers, Kindergarten and Eleme |
| 4th row | Biological Science Teachers, Postsecondary |
| 5th row | Software Developers, Systems Software |
| Value | Count | Frequency (%) |
| software | 108942 | 19.1% |
| developers | 90165 | 15.8% |
| applications | 74423 | 13.1% |
| systems | 34817 | 6.1% |
| analysts | 29299 | 5.1% |
| teachers | 29211 | 5.1% |
| computer | 27988 | 4.9% |
| postsecondary | 14623 | 2.6% |
| and | 13102 | 2.3% |
| special | 12188 | 2.1% |
| Other values (368) | 134272 | 23.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 493769 | 9.0% |
| (space) | 401753 | 7.3% |
| s | 308680 | 5.6% |
| o | 294308 | 5.3% |
| S | 284668 | 5.2% |
| a | 281011 | 5.1% |
| t | 276194 | 5.0% |
| r | 254882 | 4.6% |
| p | 220955 | 4.0% |
| A | 200772 | 3.6% |
| Other values (50) | 2495640 | 45.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 5512632 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 5512632 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 5512632 | 100.0% |
WORK_STATE
Text
| Distinct | 57 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 10.5 MiB |
Length
| Max length | 24 |
|---|---|
| Median length | 20 |
| Mean length | 8.9171499 |
| Min length | 4 |
Unique
| Unique | 2 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Illinois |
|---|---|
| 2nd row | California |
| 3rd row | California |
| 4th row | Texas |
| 5th row | Oregon |
| Value | Count | Frequency (%) |
| california | 46782 | 23.6% |
| new | 22573 | 11.4% |
| texas | 15498 | 7.8% |
| york | 11373 | 5.7% |
| jersey | 10198 | 5.1% |
| illinois | 7411 | 3.7% |
| massachusetts | 6848 | 3.5% |
| virginia | 6338 | 3.2% |
| georgia | 5615 | 2.8% |
| pennsylvania | 4725 | 2.4% |
| Other values (54) | 61071 | 30.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 198558 | 13.3% |
| i | 187697 | 12.6% |
| n | 120397 | 8.1% |
| o | 116673 | 7.8% |
| r | 108632 | 7.3% |
| e | 91442 | 6.1% |
| s | 91029 | 6.1% |
| l | 83827 | 5.6% |
| C | 56218 | 3.8% |
| f | 48152 | 3.2% |
| Other values (36) | 389018 | 26.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 1491643 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 1491643 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 1491643 | 100.0% |
WORK_STATE_ABBREVIATION
Text
| Distinct | 56 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 9.4 MiB |
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | IL |
|---|---|
| 2nd row | CA |
| 3rd row | CA |
| 4th row | TX |
| 5th row | OR |
| Value | Count | Frequency (%) |
| ca | 46782 | 28.0% |
| tx | 15498 | 9.3% |
| ny | 11373 | 6.8% |
| nj | 10198 | 6.1% |
| il | 7411 | 4.4% |
| ma | 6848 | 4.1% |
| va | 6031 | 3.6% |
| ga | 5615 | 3.4% |
| pa | 4725 | 2.8% |
| wa | 4610 | 2.8% |
| Other values (46) | 48187 | 28.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 80259 | 24.0% |
| C | 56218 | 16.8% |
| N | 32134 | 9.6% |
| M | 20111 | 6.0% |
| T | 19559 | 5.8% |
| I | 16310 | 4.9% |
| X | 15498 | 4.6% |
| L | 12917 | 3.9% |
| Y | 12059 | 3.6% |
| J | 10198 | 3.0% |
| Other values (14) | 59293 | 17.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 334556 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 334556 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 334556 | 100.0% |
WORK_POSTAL_CODE
Text
Missing
| Distinct | 6332 |
|---|---|
| Distinct (%) | 11.8% |
| Missing | 113604 |
| Missing (%) | 67.9% |
| Memory size | 6.7 MiB |
Length
| Max length | 12 |
|---|---|
| Median length | 5 |
| Mean length | 5.4244699 |
| Min length | 4 |
Unique
| Unique | 2617 |
|---|---|
| Unique (%) | 4.9% |
Sample
| 1st row | 94132.0 |
|---|---|
| 2nd row | 07417 |
| 3rd row | 83209 |
| 4th row | 11205 |
| 5th row | 85713 |
| Value | Count | Frequency (%) |
| 94043 | 2034 | 3.8% |
| 98052 | 772 | 1.4% |
| 95134.0 | 747 | 1.4% |
| 94043.0 | 666 | 1.2% |
| 94105 | 646 | 1.2% |
| 95054 | 572 | 1.1% |
| 94107 | 501 | 0.9% |
| 94103 | 451 | 0.8% |
| 92121 | 407 | 0.8% |
| 98004 | 352 | 0.7% |
| Other values (6322) | 46526 | 86.7% |
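The value table lists `94043` and `94043.0` as separate entries, and the sample mixes `94132.0` with `07417`, so the same postal code seems to be stored in more than one textual form. A minimal sketch of one way to normalize such values before counting them is shown here. The function name `normalize_postal_code` is hypothetical, and the handling of ZIP+4 values and of lost leading zeros is an assumption rather than something the report confirms.

```python
import re
from typing import Optional


def normalize_postal_code(raw: Optional[str]) -> Optional[str]:
    """Return a 5-digit ZIP string, or None when the value cannot be recovered."""
    if raw is None:
        return None
    text = str(raw).strip()
    if not text or text.lower() == "nan":
        return None
    # Drop a float-style suffix such as the ".0" in "94132.0".
    text = re.sub(r"\.0+$", "", text)
    # Assumption: keep only the leading part of a ZIP+4 value like "94043-1351".
    text = text.split("-")[0]
    if not text.isdigit() or len(text) > 5:
        return None
    # Assumption: a short code lost its leading zeros ("7417" -> "07417").
    return text.zfill(5)


for value in ["94132.0", "07417", "94043", "94043.0", "nan"]:
    print(repr(value), "->", normalize_postal_code(value))
```

With this normalization, `94043` and `94043.0` collapse into one key, which would change both the distinct count and the ranking in the table above.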
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 63094 | 21.7% |
| 1 | 32883 | 11.3% |
| 4 | 30824 | 10.6% |
| 9 | 25868 | 8.9% |
| 2 | 25846 | 8.9% |
| 3 | 24682 | 8.5% |
| 5 | 22357 | 7.7% |
| 7 | 19838 | 6.8% |
| 8 | 19324 | 6.6% |
| 6 | 14966 | 5.1% |
| Other values (2) | 11471 | 3.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 291153 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 291153 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 291153 | 100.0% |
FULL_TIME_POSITION_Y_N
Boolean
High correlation Imbalance Missing
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 11093 |
| Missing (%) | 6.6% |
| Memory size | 326.8 KiB |
| True | 152594 |
|---|---|
| False | 3591 |
| (Missing) | 11093 |
| Value | Count | Frequency (%) |
| True | 152594 | 91.2% |
| False | 3591 | 2.1% |
| (Missing) | 11093 | 6.6% |
VISA_CLASS
Categorical
High correlation Imbalance
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 9.8 MiB |
| H-1B | 154497 |
|---|---|
| greencard | 11093 |
| E-3 Australian | 1393 |
| H-1B1 Singapore | 148 |
| H-1B1 Chile | 147 |
Length
| Max length | 15 |
|---|---|
| Median length | 4 |
| Mean length | 4.4307321 |
| Min length | 4 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | H-1B |
|---|---|
| 2nd row | greencard |
| 3rd row | H-1B |
| 4th row | H-1B |
| 5th row | E-3 Australian |
Common Values
| Value | Count | Frequency (%) |
| H-1B | 154497 | 92.4% |
| greencard | 11093 | 6.6% |
| E-3 Australian | 1393 | 0.8% |
| H-1B1 Singapore | 148 | 0.1% |
| H-1B1 Chile | 147 | 0.1% |
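The `greencard` count here (11093) equals the number of missing values reported for FULL_TIME_POSITION_Y_N, which suggests that the missing values may be confined to one visa class. A minimal sketch of how to check that kind of pattern follows. The rows are hypothetical stand-ins shaped like the report's sample, and `missing_by_group` is an illustrative helper, not part of any profiling library.

```python
from collections import Counter

# Hypothetical rows: (VISA_CLASS, FULL_TIME_POSITION_Y_N); None marks a missing value.
rows = [
    ("H-1B", "y"),
    ("greencard", None),
    ("H-1B", "y"),
    ("H-1B", "n"),
    ("E-3 Australian", "y"),
    ("greencard", None),
]


def missing_by_group(pairs):
    """Return {group: (missing_count, row_count)} for (group, value) pairs."""
    missing = Counter()
    total = Counter()
    for group, value in pairs:
        total[group] += 1
        if value is None:
            missing[group] += 1
    return {group: (missing[group], total[group]) for group in total}


summary = missing_by_group(rows)
for group, (n_missing, n_rows) in summary.items():
    print(f"{group}: {n_missing}/{n_rows} missing")
```

If the real data shows every missing value falling in the `greencard` group, the gap is structural (the field is not collected for that class) rather than random, and imputing it would be misleading.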
| Value | Count | Frequency (%) |
| h-1b | 154497 | 91.4% |
| greencard | 11093 | 6.6% |
| e-3 | 1393 | 0.8% |
| australian | 1393 | 0.8% |
| h-1b1 | 295 | 0.2% |
| singapore | 148 | 0.1% |
| chile | 147 | 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 156185 | 21.1% |
| 1 | 155087 | 20.9% |
| H | 154792 | 20.9% |
| B | 154792 | 20.9% |
| r | 23727 | 3.2% |
| e | 22481 | 3.0% |
| a | 14027 | 1.9% |
| n | 12634 | 1.7% |
| g | 11241 | 1.5% |
| c | 11093 | 1.5% |
| Other values (15) | 25105 | 3.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 741164 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 741164 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 741164 | 100.0% |
PREVAILING_WAGE_PER_YEAR
Text
| Distinct | 15315 |
|---|---|
| Distinct (%) | 9.2% |
| Missing | 68 |
| Missing (%) | < 0.1% |
| Memory size | 9.9 MiB |
Length
| Max length | 9 |
|---|---|
| Median length | 5 |
| Mean length | 5.2547395 |
| Min length | 5 |
Unique
| Unique | 5469 |
|---|---|
| Unique (%) | 3.3% |
Sample
| 1st row | 320000 |
|---|---|
| 2nd row | 289798 |
| 3rd row | 283628 |
| 4th row | 283628 |
| 5th row | 260000 |
| Value | Count | Frequency (%) |
| 98675 | 1703 | 1.0% |
| 109762 | 1267 | 0.8% |
| 88254 | 1105 | 0.7% |
| 93267 | 1085 | 0.6% |
| 80746 | 984 | 0.6% |
| 116605 | 963 | 0.6% |
| 94162 | 932 | 0.6% |
| 100984 | 925 | 0.6% |
| 97219 | 854 | 0.5% |
| 57949 | 811 | 0.5% |
| Other values (15305) | 156581 | 93.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 102329 | 11.6% |
| 6 | 100389 | 11.4% |
| 1 | 92148 | 10.5% |
| 7 | 91615 | 10.4% |
| 4 | 89589 | 10.2% |
| 8 | 86016 | 9.8% |
| 5 | 79589 | 9.1% |
| 9 | 79087 | 9.0% |
| 2 | 78640 | 9.0% |
| 3 | 70729 | 8.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 878645 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 878645 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 878645 | 100.0% |
PAID_WAGE_PER_YEAR
Text
| Distinct | 20691 |
|---|---|
| Distinct (%) | 12.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 10.0 MiB |
Length
| Max length | 10 |
|---|---|
| Median length | 5 |
| Mean length | 5.3913964 |
| Min length | 5 |
Unique
| Unique | 12033 |
|---|---|
| Unique (%) | 7.2% |
Sample
| 1st row | 62171 |
|---|---|
| 2nd row | 91440 |
| 3rd row | 49470 |
| 4th row | 43800 |
| 5th row | 170000 |
| Value | Count | Frequency (%) |
| 60000 | 7650 | 4.6% |
| 65000 | 4397 | 2.6% |
| 70000 | 3878 | 2.3% |
| 90000 | 3407 | 2.0% |
| 100000 | 3355 | 2.0% |
| 80000 | 3348 | 2.0% |
| 75000 | 3068 | 1.8% |
| 110000 | 2791 | 1.7% |
| 85000 | 2743 | 1.6% |
| 105000 | 2609 | 1.6% |
| Other values (20681) | 130032 | 77.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 369344 | 41.0% |
| 1 | 83646 | 9.3% |
| 6 | 70943 | 7.9% |
| 5 | 69818 | 7.7% |
| 7 | 58053 | 6.4% |
| 8 | 56058 | 6.2% |
| 2 | 51649 | 5.7% |
| 4 | 47943 | 5.3% |
| 9 | 46121 | 5.1% |
| 3 | 39840 | 4.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 901862 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 901862 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 901862 | 100.0% |
JOB_TITLE_SUBGROUP
Categorical
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 11.7 MiB |
| software engineer | 99364 |
|---|---|
| business analyst | 27811 |
| assistant professor | 18866 |
| teacher | 13912 |
| data analyst | 3840 |
| Other values (3) | 3485 |
Length
| Max length | 21 |
|---|---|
| Median length | 17 |
| Mean length | 16.029209 |
| Min length | 7 |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | software engineer |
|---|---|
| 2nd row | assistant professor |
| 3rd row | teacher |
| 4th row | teacher |
| 5th row | software engineer |
Common Values
| Value | Count | Frequency (%) |
| software engineer | 99364 | 59.4% |
| business analyst | 27811 | 16.6% |
| assistant professor | 18866 | 11.3% |
| teacher | 13912 | 8.3% |
| data analyst | 3840 | 2.3% |
| attorney | 1488 | 0.9% |
| data scientist | 1227 | 0.7% |
| management consultant | 770 | 0.5% |
| Value | Count | Frequency (%) |
| software | 99364 | 31.1% |
| engineer | 99364 | 31.1% |
| analyst | 31651 | 9.9% |
| business | 27811 | 8.7% |
| assistant | 18866 | 5.9% |
| professor | 18866 | 5.9% |
| teacher | 13912 | 4.4% |
| data | 5067 | 1.6% |
| attorney | 1488 | 0.5% |
| scientist | 1227 | 0.4% |
| Other values (2) | 1540 | 0.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 476212 | 17.8% |
| s | 312002 | 11.6% |
| n | 282851 | 10.5% |
| r | 251860 | 9.4% |
| a | 228242 | 8.5% |
| t | 195466 | 7.3% |
| (space) | 151878 | 5.7% |
| i | 148495 | 5.5% |
| o | 139354 | 5.2% |
| f | 118230 | 4.4% |
| Other values (11) | 376744 | 14.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 2681334 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 2681334 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 2681334 | 100.0% |
order
Real number (ℝ)
High correlation Uniform Unique
| Distinct | 167278 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 83714.716 |
| Minimum | 1 |
|---|---|
| Maximum | 167361 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.3 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 8401.85 |
| Q1 | 41901.25 |
| median | 83722.5 |
| Q3 | 125541.75 |
| 95-th percentile | 158997.15 |
| Maximum | 167361 |
| Range | 167360 |
| Interquartile range (IQR) | 83640.5 |
Descriptive statistics
| Standard deviation | 48300.236 |
|---|---|
| Coefficient of variation (CV) | 0.57696231 |
| Kurtosis | -1.1996064 |
| Mean | 83714.716 |
| Median Absolute Deviation (MAD) | 41820.5 |
| Skewness | -0.0005330863 |
| Sum | 1.400363 × 10^10 |
| Variance | 2.3329128 × 10^9 |
| Monotonicity | Strictly increasing |
| Value | Count | Frequency (%) |
| 1 | 1 | < 0.1% |
| 111616 | 1 | < 0.1% |
| 111598 | 1 | < 0.1% |
| 111599 | 1 | < 0.1% |
| 111600 | 1 | < 0.1% |
| 111601 | 1 | < 0.1% |
| 111602 | 1 | < 0.1% |
| 111603 | 1 | < 0.1% |
| 111604 | 1 | < 0.1% |
| 111605 | 1 | < 0.1% |
| Other values (167268) | 167268 | > 99.9% |
| Value | Count | Frequency (%) |
| 1 | 1 | < 0.1% |
| 2 | 1 | < 0.1% |
| 3 | 1 | < 0.1% |
| 4 | 1 | < 0.1% |
| 5 | 1 | < 0.1% |
| 6 | 1 | < 0.1% |
| 7 | 1 | < 0.1% |
| 8 | 1 | < 0.1% |
| 9 | 1 | < 0.1% |
| 10 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 167361 | 1 | < 0.1% |
| 167360 | 1 | < 0.1% |
| 167359 | 1 | < 0.1% |
| 167358 | 1 | < 0.1% |
| 167357 | 1 | < 0.1% |
| 167356 | 1 | < 0.1% |
| 167355 | 1 | < 0.1% |
| 167354 | 1 | < 0.1% |
| 167353 | 1 | < 0.1% |
| 167352 | 1 | < 0.1% |
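The statistics for `order` report 167278 distinct values between 1 and 167361, so if the column holds consecutive integer identifiers (an assumption; the report types it as a real number), some identifiers are absent. A small sketch of that arithmetic, plus a hypothetical helper `find_gaps` that locates the missing runs in a strictly increasing list:

```python
def missing_count(n_distinct: int, minimum: int, maximum: int) -> int:
    """Number of integers in [minimum, maximum] that never occur."""
    return (maximum - minimum + 1) - n_distinct


def find_gaps(sorted_ids):
    """Yield (first_missing, last_missing) for each gap in a strictly increasing list."""
    for prev, cur in zip(sorted_ids, sorted_ids[1:]):
        if cur - prev > 1:
            yield (prev + 1, cur - 1)


# Figures taken from the statistics above.
print(missing_count(167278, 1, 167361))  # 83

# Toy list, not data from the report.
print(list(find_gaps([1, 2, 3, 6, 7, 10])))  # [(4, 5), (8, 9)]
```

The 83 absent identifiers would be consistent with rows having been dropped after `order` was assigned, though the report itself does not say why they are missing.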
Interactions
Correlations
| | CASE_STATUS | EDUCATION_LEVEL_REQUIRED | EXPERIENCE_REQUIRED_NUM_MONTHS | EXPERIENCE_REQUIRED_Y_N | FULL_TIME_POSITION_Y_N | JOB_TITLE_SUBGROUP | PAID_WAGE_SUBMITTED_UNIT | PREVAILING_WAGE_SUBMITTED_UNIT | VISA_CLASS | order |
|---|---|---|---|---|---|---|---|---|---|---|
| CASE_STATUS | 1.000 | 0.099 | 0.029 | 0.113 | 0.026 | 0.075 | 0.037 | 0.039 | 0.268 | 0.059 |
| EDUCATION_LEVEL_REQUIRED | 0.099 | 1.000 | 0.340 | 0.298 | 0.000 | 0.449 | 0.000 | 0.000 | 1.000 | 0.282 |
| EXPERIENCE_REQUIRED_NUM_MONTHS | 0.029 | 0.340 | 1.000 | 0.059 | 0.000 | 0.082 | 0.000 | 0.000 | 1.000 | -0.550 |
| EXPERIENCE_REQUIRED_Y_N | 0.113 | 0.298 | 0.059 | 1.000 | 0.000 | 0.332 | 0.000 | 0.000 | 1.000 | 0.326 |
| FULL_TIME_POSITION_Y_N | 0.026 | 0.000 | 0.000 | 0.000 | 1.000 | 0.178 | 0.642 | 0.639 | 0.003 | 0.099 |
| JOB_TITLE_SUBGROUP | 0.075 | 0.449 | 0.082 | 0.332 | 0.178 | 1.000 | 0.103 | 0.099 | 0.061 | 0.298 |
| PAID_WAGE_SUBMITTED_UNIT | 0.037 | 0.000 | 0.000 | 0.000 | 0.642 | 0.103 | 1.000 | 0.858 | 0.042 | 0.058 |
| PREVAILING_WAGE_SUBMITTED_UNIT | 0.039 | 0.000 | 0.000 | 0.000 | 0.639 | 0.099 | 0.858 | 1.000 | 0.033 | 0.056 |
| VISA_CLASS | 0.268 | 1.000 | 1.000 | 1.000 | 0.003 | 0.061 | 0.042 | 0.033 | 1.000 | 0.096 |
| order | 0.059 | 0.282 | -0.550 | 0.326 | 0.099 | 0.298 | 0.058 | 0.056 | 0.096 | 1.000 |
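The report does not state in this section which measure fills the matrix. Assuming that the pairs of categorical fields are scored with an association measure such as Cramér's V, the sketch below shows how a value like the 1.000 between VISA_CLASS and EDUCATION_LEVEL_REQUIRED can arise. It is an uncorrected textbook version written for illustration, not the profiler's own implementation, and `cramers_v` is a hypothetical name.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Cramér's V for two equal-length sequences of labels (no bias correction)."""
    pairs = list(zip(xs, ys))
    n = len(pairs)
    row_totals = Counter(x for x, _ in pairs)
    col_totals = Counter(y for _, y in pairs)
    observed = Counter(pairs)
    k = min(len(row_totals), len(col_totals))
    if n == 0 or k < 2:
        return 0.0  # undefined when either side has a single level; report 0 here
    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            chi2 += (observed[(r, c)] - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (k - 1)))


# Toy labels: the first pair is perfectly associated, the second is independent.
print(cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"]))  # 1.0
print(cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"]))  # 0.0
```

A score of 1 means one field fully determines the other in the rows compared. With fields that are more than 93% missing, such a score may reflect the missingness pattern as much as any real relationship.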
Missing values
Sample
| | CASE_NUMBER | CASE_STATUS | CASE_RECEIVED_DATE | DECISION_DATE | EMPLOYER_NAME | PREVAILING_WAGE_SUBMITTED | PREVAILING_WAGE_SUBMITTED_UNIT | PAID_WAGE_SUBMITTED | PAID_WAGE_SUBMITTED_UNIT | JOB_TITLE | WORK_CITY | EDUCATION_LEVEL_REQUIRED | COLLEGE_MAJOR_REQUIRED | EXPERIENCE_REQUIRED_Y_N | EXPERIENCE_REQUIRED_NUM_MONTHS | COUNTRY_OF_CITIZENSHIP | PREVAILING_WAGE_SOC_CODE | PREVAILING_WAGE_SOC_TITLE | WORK_STATE | WORK_STATE_ABBREVIATION | WORK_POSTAL_CODE | FULL_TIME_POSITION_Y_N | VISA_CLASS | PREVAILING_WAGE_PER_YEAR | PAID_WAGE_PER_YEAR | JOB_TITLE_SUBGROUP | order |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | I-200-14073-248840 | denied | 3/14/2014 | 3/21/2014 | ADVANCED TECHNOLOGY GROUP USA, INC. | 6217100 | year | 62171 | year | SOFTWARE ENGINEER | BLOOMINGTON | NaN | NaN | NaN | NaN | NaN | 15-1132 | Software Developers, Applications | Illinois | IL | NaN | y | H-1B | NaN | 62171 | software engineer | 1 |
| 1 | A-15061-55212 | denied | 3/19/2015 | 3/19/2015 | SAN FRANCISCO STATE UNIVERSITY | 5067600 | year | 91440 | year | Assistant Professor of Marketing | SAN FRANCISCO | Doctorate | marketing | n | NaN | IRAN | 25-1011 | Business Teachers, Postsecondary | California | CA | 94132.0 | NaN | greencard | NaN | 91440 | assistant professor | 2 |
| 2 | I-200-13256-001092 | denied | 9/13/2013 | 9/23/2013 | CAROUSEL SCHOOL | 4947000 | year | 49470 | year | SPECIAL EDUCATION TEACHER | LOS ANGELES | NaN | NaN | NaN | NaN | NaN | 25-2052 | Special Education Teachers, Kindergarten and Eleme | California | CA | NaN | y | H-1B | NaN | 49470 | teacher | 3 |
| 3 | I-200-14087-353657 | denied | 3/28/2014 | 4/7/2014 | HARLINGEN CONSOLIDATED INDEPENDENT SCHOOL DISTRICT | 251052.00 | month | 43800 | year | SCIENCE TEACHER | HARLINGEN CISD | NaN | NaN | NaN | NaN | NaN | 25-1042 | Biological Science Teachers, Postsecondary | Texas | TX | NaN | y | H-1B | NaN | 43800 | teacher | 4 |
| 4 | I-203-14259-128844 | denied | 9/16/2014 | 9/23/2014 | SIGNAL SCIENCES CORPORATION | 84573.00 | bi-weekly | 170000 | year | SENIOR SOFTWARE ENGINEER | PORTLAND | NaN | NaN | NaN | NaN | NaN | 15-1133 | Software Developers, Systems Software | Oregon | OR | NaN | y | E-3 Australian | NaN | 170000 | software engineer | 5 |
| 5 | I-200-14092-483272 | denied | 4/2/2014 | 4/9/2014 | CAPGEMINI U.S. LLC | 113610 | month | 114421 | year | ORACLE SCM ANALYST/BUSINESS ANALYST | SOUTH SAN FRANCISCO | NaN | NaN | NaN | NaN | NaN | 15-1121 | Computer Systems Analysts | California | CA | NaN | y | H-1B | NaN | 114421 | business analyst | 6 |
| 6 | I-200-13084-487292 | denied | 3/25/2013 | 3/28/2013 | PURE STORAGE, INC. | 1333328 | year | 145000 | year | SENIOR SOFTWARE ENGINEER | MOUNTAIN VIEW | NaN | NaN | NaN | NaN | NaN | 15-1132 | Software Developers, Applications | California | CA | NaN | y | H-1B | NaN | 145000 | software engineer | 7 |
| 7 | I-200-13126-805026 | denied | 5/6/2013 | 5/8/2013 | POLMAK, INC. | 104458 | month | 104458 | year | SOFTWARE ENGINEER | RIDGEFIELD PARK | NaN | NaN | NaN | NaN | NaN | 15-1132 | Software Developers, Applications | New Jersey | NJ | NaN | y | H-1B | NaN | 104458 | software engineer | 8 |
| 8 | I-200-13128-133480 | denied | 5/10/2013 | 5/14/2013 | GOOGLE INC. | 1212002.00 | year | 160000 | year | SOFTWARE ENGINEER | NEW YORK CITY | NaN | NaN | NaN | NaN | NaN | 15-1132 | Software Developers, Applications | New York | NY | NaN | y | H-1B | NaN | 160000 | software engineer | 9 |
| 9 | I-200-14069-400950 | denied | 3/10/2014 | 3/18/2014 | STLPORT CONSULTING, INC. | 98675.00 | month | 98675 | year | SOFTWARE ENGINEER | MOUNTAIN VIEW | NaN | NaN | NaN | NaN | NaN | 15-1132 | Software Developers, Applications | California | CA | NaN | y | H-1B | NaN | 98675 | software engineer | 10 |
| | CASE_NUMBER | CASE_STATUS | CASE_RECEIVED_DATE | DECISION_DATE | EMPLOYER_NAME | PREVAILING_WAGE_SUBMITTED | PREVAILING_WAGE_SUBMITTED_UNIT | PAID_WAGE_SUBMITTED | PAID_WAGE_SUBMITTED_UNIT | JOB_TITLE | WORK_CITY | EDUCATION_LEVEL_REQUIRED | COLLEGE_MAJOR_REQUIRED | EXPERIENCE_REQUIRED_Y_N | EXPERIENCE_REQUIRED_NUM_MONTHS | COUNTRY_OF_CITIZENSHIP | PREVAILING_WAGE_SOC_CODE | PREVAILING_WAGE_SOC_TITLE | WORK_STATE | WORK_STATE_ABBREVIATION | WORK_POSTAL_CODE | FULL_TIME_POSITION_Y_N | VISA_CLASS | PREVAILING_WAGE_PER_YEAR | PAID_WAGE_PER_YEAR | JOB_TITLE_SUBGROUP | order |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 167268 | I-203-15014-844277 | denied | 1/14/2015 | 1/22/2015 | RESCUE RESPONSE GEAR INC. | 1000 | month | 1000 | month | TEACHER AND INSTRUCTOR | BEND | NaN | NaN | NaN | NaN | NaN | 25-3099 | TEACHERS AND INSTRUCTORS, ALL OTHER | Oregon | OR | 97701 | n | E-3 Australian | 12000 | 12000 | teacher | 167352 |
| 167269 | I-200-14071-935389 | denied | 3/12/2014 | 3/20/2014 | SACRED HEART SCHOOL | 11500.00 | year | 22800 | year | RELIGION TEACHER | DEL RIO | NaN | NaN | NaN | NaN | NaN | 21-2021 | Directors, Religious Activities and Education | Texas | TX | NaN | y | H-1B | 11500 | 22800 | teacher | 167353 |
| 167270 | I-200-12241-089395 | certified-withdrawn | 8/28/2012 | 6/6/2013 | CHINESE BIBLE CHURCH INTERNATIONAL, INC. | 5.05 | hour | 5.6 | hour | MIDDLE SCHOOL TEACHERS | SAIPAN | NaN | NaN | NaN | NaN | NaN | 25-2022 | Middle School Teachers, Except Special and Career/ | Northern Mariana Islands | MP | NaN | y | H-1B | 10504 | 11648 | teacher | 167354 |
| 167271 | I-200-12241-745406 | certified-withdrawn | 8/28/2012 | 6/6/2013 | CHINESE BIBLE CHURCH INTERNATIONAL, INC. | 5.05 | hour | 5.6 | hour | MIDDLE SCHOOL TEACHERS | SAIPAN | NaN | NaN | NaN | NaN | NaN | 25-2022 | Middle School Teachers, Except Special and Career/ | Northern Mariana Islands | MP | NaN | y | H-1B | 10504 | 11648 | teacher | 167355 |
| 167272 | I-200-12241-762461 | certified-withdrawn | 8/28/2012 | 6/6/2013 | CHINESE BIBLE CHURCH INTERNATIONAL, INC. | 5.05 | hour | 5.6 | hour | MIDDLE SCHOOL TEACHERS | SAIPAN | NaN | NaN | NaN | NaN | NaN | 25-2022 | Middle School Teachers, Except Special and Career/ | Northern Mariana Islands | MP | NaN | y | H-1B | 10504 | 11648 | teacher | 167356 |
| 167273 | I-200-12241-209885 | certified-withdrawn | 8/28/2012 | 6/6/2013 | CHINESE BIBLE CHURCH INTERNATIONAL, INC. | 5.05 | hour | 5.6 | hour | MIDDLE SCHOOL TEACHERS | SAIPAN | NaN | NaN | NaN | NaN | NaN | 25-2022 | Middle School Teachers, Except Special and Career/ | Northern Mariana Islands | MP | NaN | y | H-1B | 10504 | 11648 | teacher | 167357 |
| 167274 | I-200-11305-143547 | denied | 11/1/2011 | 11/3/2011 | CHINESE BIBLE CHURCH INTERNATIONAL, INC. | 5,05 | hour | 5,25 | hour | PRESCHOOL TEACHER | SAIPAN | NaN | NaN | NaN | NaN | NaN | 25-2011 | Preschool Teachers, Except Special Education | Northern Mariana Islands | MP | NaN | y | H-1B | 10504 | 10920 | teacher | 167358 |
| 167275 | I-200-11313-833007 | certified | 11/9/2011 | 11/16/2011 | CHINESE BIBLE CHURCH INTERNATIONAL, INC. | 5,05 | hour | 5,25 | hour | TEACHER | SAIPAN | NaN | NaN | NaN | NaN | NaN | 25-3999 | Teachers and Instructors, All Other* | Northern Mariana Islands | MP | NaN | y | H-1B | 10504 | 10920 | teacher | 167359 |
| 167276 | I-200-11312-798611 | denied | 11/8/2011 | 11/15/2011 | CHINESE BIBLE CHURCH INTERNATIONAL, INC. | 5,05 | hour | 5,1 | hour | PRESCHOOL TEACHER | SAIPAN | NaN | NaN | NaN | NaN | NaN | 25-2011 | Preschool Teachers, Except Special Education | Northern Mariana Islands | MP | NaN | y | H-1B | 10504 | 10608 | teacher | 167360 |
| 167277 | I-200-11297-523711 | denied | 10/24/2011 | 10/26/2011 | CHINESE BIBLE CHURCH INTERNATIONAL, INC. | 5,05 | hour | 5,05 | hour | PRESCHOOL TEACHER | SAIPAN | NaN | NaN | NaN | NaN | NaN | 25-2011 | Preschool Teachers, Except Special Education | Northern Mariana Islands | MP | NaN | y | H-1B | 10504 | 10504 | teacher | 167361 |
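The last rows of the sample hint at how the `_PER_YEAR` columns relate to the submitted wages: 5.05 per hour appears as 10504 (5.05 × 2080), 1000 per month as 12000, and values written with a decimal comma (`5,05`) yield the same yearly figure as `5.05`. A minimal sketch consistent with those rows is given below. The function `annualize_wage` is hypothetical, and the factors for `week` and `bi-weekly` are assumptions that the sample does not confirm.

```python
from typing import Optional

# Periods per year. "hour" (2080) and "month" (12) reproduce the sample rows;
# "week" and "bi-weekly" are assumed values, not confirmed by the report.
PERIODS_PER_YEAR = {
    "year": 1,
    "month": 12,
    "bi-weekly": 26,
    "week": 52,
    "hour": 2080,
}


def annualize_wage(amount: str, unit: str) -> Optional[int]:
    """Convert a submitted wage string and its unit to a rounded yearly figure."""
    factor = PERIODS_PER_YEAR.get(unit.strip().lower())
    if factor is None:
        return None
    try:
        # Treat a comma as a decimal mark ("5,05"); values that use commas as
        # thousands separators ("1,000.50") fail to parse and return None.
        value = float(amount.strip().replace(",", "."))
    except ValueError:
        return None
    return round(value * factor)


print(annualize_wage("5,05", "hour"))    # 10504
print(annualize_wage("1000", "month"))   # 12000
print(annualize_wage("11500.00", "year"))  # 11500
```

The first rows of the sample show a different problem: prevailing wages such as `6217100` per year sit next to a paid wage of `62171`, and PREVAILING_WAGE_PER_YEAR is NaN for them. A conversion like this one would therefore need a plausibility check before its output is trusted.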